Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from parametric distributions. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, non-robust methods such as the ''t''-test perform badly.

== Introduction ==
Robust statistics seeks to provide methods that emulate popular statistical methods but are not unduly affected by outliers or other small departures from model assumptions. In statistics, classical estimation methods rely heavily on assumptions which are often not met in practice. In particular, it is often assumed that the data errors are normally distributed, at least approximately, or that the central limit theorem can be relied on to produce normally distributed estimates. Unfortunately, when there are outliers in the data, classical estimators often perform very poorly when judged using the ''breakdown point'' and the ''influence function'', described below.

The practical effect of problems seen in the influence function can be studied empirically by examining the sampling distribution of proposed estimators under a mixture model, where one mixes in a small amount (1–5% is often sufficient) of contamination. For instance, one may use a mixture of 95% a normal distribution and 5% a normal distribution with the same mean but significantly higher standard deviation (representing outliers); such a study is sketched at the end of this section.

Robust parametric statistics can proceed in two ways:
*by designing estimators so that a pre-selected behaviour of the influence function is achieved
*by replacing estimators that are optimal under the assumption of a normal distribution with estimators that are optimal for, or at least derived for, other distributions: for example using the ''t''-distribution with low degrees of freedom (high kurtosis; degrees of freedom between 4 and 6 have often been found to be useful in practice) or with a mixture of two or more distributions; a sketch of a ''t''-based location estimate also appears at the end of this section

Robust estimates have been studied for the following problems:
:estimating location parameters
:estimating scale parameters
:estimating regression coefficients
:estimation of model states in models expressed in state-space form, for which the standard method is equivalent to a Kalman filter
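The following is a minimal sketch, not part of the original article, of the empirical contamination study described above: it draws repeated samples from a 95% N(0,1) and 5% N(0,10²) mixture and compares the Monte Carlo sampling distributions of the sample mean and the sample median. All function names are illustrative; only NumPy is assumed.

<syntaxhighlight lang="python">
# Sketch: sampling distributions of the mean and the median under a
# 95% N(0,1) + 5% N(0,10^2) contamination mixture (names are illustrative).
import numpy as np

rng = np.random.default_rng(0)

def contaminated_sample(n, eps=0.05, sigma_out=10.0):
    """Draw n points from (1 - eps)*N(0,1) + eps*N(0, sigma_out^2)."""
    outlier = rng.random(n) < eps
    x = rng.normal(0.0, 1.0, n)
    x[outlier] = rng.normal(0.0, sigma_out, outlier.sum())
    return x

def sampling_distribution(estimator, n=100, reps=10_000):
    """Monte Carlo sampling distribution of an estimator under contamination."""
    return np.array([estimator(contaminated_sample(n)) for _ in range(reps)])

means = sampling_distribution(np.mean)
medians = sampling_distribution(np.median)

# The spread of the sample mean is inflated far more by the 5% contamination
# than that of the sample median, illustrating the mean's lack of robustness.
print("std of sample mean:  ", means.std())
print("std of sample median:", medians.std())
</syntaxhighlight>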
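As one concrete instance of the second approach (replacing a normal-optimal estimator with one derived from another distribution), the following sketch, again an illustration added here rather than a method from the article, estimates a location parameter by maximum likelihood under a Student's ''t'' distribution with 5 degrees of freedom, with the scale fixed at a MAD-based estimate and the fit computed by the standard iteratively reweighted mean.

<syntaxhighlight lang="python">
# Sketch: location estimate that is optimal under a t(5) model rather than a
# normal model; the scale is held fixed at a MAD-based estimate.
import numpy as np

def t_location(x, df=5.0, tol=1e-8, max_iter=200):
    """MLE of the location of a t(df) model with fixed scale,
    computed by the usual EM / iteratively reweighted mean."""
    x = np.asarray(x, dtype=float)
    scale = 1.4826 * np.median(np.abs(x - np.median(x)))  # robust scale (MAD)
    mu = np.median(x)                                     # robust starting value
    for _ in range(max_iter):
        z = (x - mu) / scale
        w = (df + 1.0) / (df + z**2)          # downweights large residuals
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

# Usage: a few gross outliers barely move the t-based estimate,
# while they pull the sample mean upward noticeably.
data = np.concatenate([np.random.default_rng(1).normal(0, 1, 95),
                       [50, 60, 70, 80, 90]])
print("sample mean:", data.mean())
print("t-location: ", t_location(data))
</syntaxhighlight>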